Overview
Dataset statistics
| Number of variables | 10 |
|---|---|
| Number of observations | 690 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 43 |
| Duplicate rows (%) | 6.2% |
| Total size in memory | 54.0 KiB |
| Average record size in memory | 80.2 B |
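The percentage and per-record figures in this table follow from the raw counts. A minimal sketch of that arithmetic, using only the numbers printed above (the variable names are illustrative and do not come from the report):

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 690
n_duplicate_rows = 43
total_kib = 54.0  # already rounded to one decimal in the report

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Approximate bytes per record, derived from the rounded total size.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Approx. record size: {avg_record_bytes:.1f} B")
```

The record size computed this way comes out slightly below the reported 80.2 B, most likely because the total size shown here is itself rounded.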
Variable types
| Numeric | 9 |
|---|---|
| Categorical | 1 |
Alerts
| Dataset has 43 (6.2%) duplicate rows | Duplicates |
|---|---|
| Bare Nuclei is highly overall correlated with Bland Chromatin and 7 other fields | High correlation |
| Bland Chromatin is highly overall correlated with Bare Nuclei and 7 other fields | High correlation |
| Class is highly overall correlated with Bare Nuclei and 8 other fields | High correlation |
| Clump Thickness is highly overall correlated with Bare Nuclei and 7 other fields | High correlation |
| Marginal Adhesion is highly overall correlated with Bare Nuclei and 7 other fields | High correlation |
| Mitoses is highly overall correlated with Class and 2 other fields | High correlation |
| Normal Nucleoli is highly overall correlated with Bare Nuclei and 8 other fields | High correlation |
| Single Epithelial Cell Size is highly overall correlated with Bare Nuclei and 7 other fields | High correlation |
| Uniformity of Cell Shape is highly overall correlated with Bare Nuclei and 7 other fields | High correlation |
| Uniformity of Cell Size is highly overall correlated with Bare Nuclei and 8 other fields | High correlation |
Reproduction
| Analysis started | 2025-01-05 15:04:47.147379 |
|---|---|
| Analysis finished | 2025-01-05 15:04:55.007542 |
| Duration | 7.86 seconds |
| Software version | ydata-profiling v4.12.1 |
| Download configuration | config.json |
Variables
Clump Thickness
Real number (ℝ)
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.4289855 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.8173782 |
|---|---|
| Coefficient of variation (CV) | 0.63612271 |
| Kurtosis | -0.62777543 |
| Mean | 4.4289855 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.58938754 |
| Sum | 3056 |
| Variance | 7.9376202 |
| Monotonicity | Not monotonic |
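The descriptive statistics above can be cross-checked against the Clump Thickness value counts listed in this section. The sketch below assumes the variance uses the sample (n - 1) denominator, which is an assumption that happens to agree with the reported figure; the variable names are illustrative.

```python
import math

# Value -> count, copied from the Clump Thickness frequency table.
counts = {1: 142, 2: 50, 3: 105, 4: 80, 5: 129,
          6: 33, 7: 23, 8: 46, 9: 13, 10: 69}

n = sum(counts.values())                        # number of observations
total = sum(v * c for v, c in counts.items())   # the "Sum" row
mean = total / n

# Sample variance with an (n - 1) denominator (assumed, see lead-in).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(round(mean, 7), round(variance, 7), round(std, 7), round(cv, 8))
```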
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 142 | 20.6% |
| 5 | 129 | 18.7% |
| 3 | 105 | 15.2% |
| 4 | 80 | 11.6% |
| 10 | 69 | 10.0% |
| 2 | 50 | 7.2% |
| 8 | 46 | 6.7% |
| 6 | 33 | 4.8% |
| 7 | 23 | 3.3% |
| 9 | 13 | 1.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 142 | 20.6% |
| 2 | 50 | 7.2% |
| 3 | 105 | 15.2% |
| 4 | 80 | 11.6% |
| 5 | 129 | 18.7% |
| 6 | 33 | 4.8% |
| 7 | 23 | 3.3% |
| 8 | 46 | 6.7% |
| 9 | 13 | 1.9% |
| 10 | 69 | 10.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 69 | 10.0% |
| 9 | 13 | 1.9% |
| 8 | 46 | 6.7% |
| 7 | 23 | 3.3% |
| 6 | 33 | 4.8% |
| 5 | 129 | 18.7% |
| 4 | 80 | 11.6% |
| 3 | 105 | 15.2% |
| 2 | 50 | 7.2% |
| 1 | 142 | 20.6% |
Uniformity of Cell Size
Real number (ℝ)
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.1333333 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 5 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 3.0424508 |
|---|---|
| Coefficient of variation (CV) | 0.97099494 |
| Kurtosis | 0.10094757 |
| Mean | 3.1333333 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.2311166 |
| Sum | 2162 |
| Variance | 9.256507 |
| Monotonicity | Not monotonic |
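A Median Absolute Deviation of 0 can look surprising next to a standard deviation above 3. It arises because more than half of the observations equal the median, so the median of the absolute deviations is also 0. A small sketch using the Uniformity of Cell Size value counts from this section (names are illustrative):

```python
import statistics

# Value -> count, copied from the Uniformity of Cell Size frequency table.
counts = {1: 378, 2: 45, 3: 51, 4: 40, 5: 30,
          6: 27, 7: 19, 8: 29, 9: 6, 10: 65}

# Expand the counts back into a sorted list of 690 observations.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]

median = statistics.median(values)

# MAD: the median of absolute deviations from the median.
mad = statistics.median(abs(v - median) for v in values)

print(median, mad)
```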
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 378 | 54.8% |
| 10 | 65 | 9.4% |
| 3 | 51 | 7.4% |
| 2 | 45 | 6.5% |
| 4 | 40 | 5.8% |
| 5 | 30 | 4.3% |
| 8 | 29 | 4.2% |
| 6 | 27 | 3.9% |
| 7 | 19 | 2.8% |
| 9 | 6 | 0.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 378 | 54.8% |
| 2 | 45 | 6.5% |
| 3 | 51 | 7.4% |
| 4 | 40 | 5.8% |
| 5 | 30 | 4.3% |
| 6 | 27 | 3.9% |
| 7 | 19 | 2.8% |
| 8 | 29 | 4.2% |
| 9 | 6 | 0.9% |
| 10 | 65 | 9.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 65 | 9.4% |
| 9 | 6 | 0.9% |
| 8 | 29 | 4.2% |
| 7 | 19 | 2.8% |
| 6 | 27 | 3.9% |
| 5 | 30 | 4.3% |
| 4 | 40 | 5.8% |
| 3 | 51 | 7.4% |
| 2 | 45 | 6.5% |
| 1 | 378 | 54.8% |
Uniformity of Cell Shape
Real number (ℝ)
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.2043478 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 5 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.9608444 |
|---|---|
| Coefficient of variation (CV) | 0.92400842 |
| Kurtosis | 0.013496459 |
| Mean | 3.2043478 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.1616176 |
| Sum | 2211 |
| Variance | 8.7665994 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 347 | 50.3% |
| 2 | 59 | 8.6% |
| 3 | 56 | 8.1% |
| 10 | 56 | 8.1% |
| 4 | 44 | 6.4% |
| 5 | 33 | 4.8% |
| 7 | 30 | 4.3% |
| 6 | 30 | 4.3% |
| 8 | 28 | 4.1% |
| 9 | 7 | 1.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 347 | 50.3% |
| 2 | 59 | 8.6% |
| 3 | 56 | 8.1% |
| 4 | 44 | 6.4% |
| 5 | 33 | 4.8% |
| 6 | 30 | 4.3% |
| 7 | 30 | 4.3% |
| 8 | 28 | 4.1% |
| 9 | 7 | 1.0% |
| 10 | 56 | 8.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 56 | 8.1% |
| 9 | 7 | 1.0% |
| 8 | 28 | 4.1% |
| 7 | 30 | 4.3% |
| 6 | 30 | 4.3% |
| 5 | 33 | 4.8% |
| 4 | 44 | 6.4% |
| 3 | 56 | 8.1% |
| 2 | 59 | 8.6% |
| 1 | 347 | 50.3% |
Marginal Adhesion
Real number (ℝ)
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.8275362 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 4 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.8677874 |
|---|---|
| Coefficient of variation (CV) | 1.0142354 |
| Kurtosis | 0.92519186 |
| Mean | 2.8275362 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.5054065 |
| Sum | 1951 |
| Variance | 8.2242044 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 400 | 58.0% |
| 3 | 58 | 8.4% |
| 2 | 56 | 8.1% |
| 10 | 55 | 8.0% |
| 4 | 33 | 4.8% |
| 8 | 25 | 3.6% |
| 5 | 23 | 3.3% |
| 6 | 22 | 3.2% |
| 7 | 13 | 1.9% |
| 9 | 5 | 0.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 400 | 58.0% |
| 2 | 56 | 8.1% |
| 3 | 58 | 8.4% |
| 4 | 33 | 4.8% |
| 5 | 23 | 3.3% |
| 6 | 22 | 3.2% |
| 7 | 13 | 1.9% |
| 8 | 25 | 3.6% |
| 9 | 5 | 0.7% |
| 10 | 55 | 8.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 55 | 8.0% |
| 9 | 5 | 0.7% |
| 8 | 25 | 3.6% |
| 7 | 13 | 1.9% |
| 6 | 22 | 3.2% |
| 5 | 23 | 3.3% |
| 4 | 33 | 4.8% |
| 3 | 58 | 8.4% |
| 2 | 56 | 8.1% |
| 1 | 400 | 58.0% |
Single Epithelial Cell Size
Real number (ℝ)
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.2130435 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 8 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.2009638 |
|---|---|
| Coefficient of variation (CV) | 0.68500904 |
| Kurtosis | 2.2065476 |
| Mean | 3.2130435 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.7167799 |
| Sum | 2217 |
| Variance | 4.8442418 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 382 | 55.4% |
| 3 | 71 | 10.3% |
| 4 | 48 | 7.0% |
| 1 | 45 | 6.5% |
| 6 | 41 | 5.9% |
| 5 | 39 | 5.7% |
| 10 | 30 | 4.3% |
| 8 | 20 | 2.9% |
| 7 | 12 | 1.7% |
| 9 | 2 | 0.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 45 | 6.5% |
| 2 | 382 | 55.4% |
| 3 | 71 | 10.3% |
| 4 | 48 | 7.0% |
| 5 | 39 | 5.7% |
| 6 | 41 | 5.9% |
| 7 | 12 | 1.7% |
| 8 | 20 | 2.9% |
| 9 | 2 | 0.3% |
| 10 | 30 | 4.3% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 30 | 4.3% |
| 9 | 2 | 0.3% |
| 8 | 20 | 2.9% |
| 7 | 12 | 1.7% |
| 6 | 41 | 5.9% |
| 5 | 39 | 5.7% |
| 4 | 48 | 7.0% |
| 3 | 71 | 10.3% |
| 2 | 382 | 55.4% |
| 1 | 45 | 6.5% |
Bare Nuclei
Real number (ℝ)
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.5028986 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 5.75 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 4.75 |
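Bare Nuclei takes only integer values, yet Q3 is reported as 5.75. That is what linear interpolation between neighbouring order statistics produces when the quartile position falls between a 5 and a 6. The sketch below assumes that interpolation rule (it agrees with the reported value, but the report does not state the method) and uses the Bare Nuclei value counts from this section:

```python
import statistics

# Value -> count, copied from the Bare Nuclei frequency table.
counts = {1: 410, 2: 30, 3: 28, 4: 19, 5: 30,
          6: 4, 7: 8, 8: 22, 9: 9, 10: 130}

# Expand the counts back into a sorted list of 690 observations.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]

# method="inclusive" interpolates linearly between order statistics,
# treating the data as the full population (assumed to match the report).
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

With 690 sorted values, the third quartile sits at position 0.75 × 689 = 516.75, three quarters of the way from the last 5 to the first 6.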
Descriptive statistics
| Standard deviation | 3.6227178 |
|---|---|
| Coefficient of variation (CV) | 1.0342058 |
| Kurtosis | -0.74689471 |
| Mean | 3.5028986 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.0140652 |
| Sum | 2417 |
| Variance | 13.124084 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 410 | 59.4% |
| 10 | 130 | 18.8% |
| 2 | 30 | 4.3% |
| 5 | 30 | 4.3% |
| 3 | 28 | 4.1% |
| 8 | 22 | 3.2% |
| 4 | 19 | 2.8% |
| 9 | 9 | 1.3% |
| 7 | 8 | 1.2% |
| 6 | 4 | 0.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 410 | 59.4% |
| 2 | 30 | 4.3% |
| 3 | 28 | 4.1% |
| 4 | 19 | 2.8% |
| 5 | 30 | 4.3% |
| 6 | 4 | 0.6% |
| 7 | 8 | 1.2% |
| 8 | 22 | 3.2% |
| 9 | 9 | 1.3% |
| 10 | 130 | 18.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 130 | 18.8% |
| 9 | 9 | 1.3% |
| 8 | 22 | 3.2% |
| 7 | 8 | 1.2% |
| 6 | 4 | 0.6% |
| 5 | 30 | 4.3% |
| 4 | 19 | 2.8% |
| 3 | 28 | 4.1% |
| 2 | 30 | 4.3% |
| 1 | 410 | 59.4% |
Bland Chromatin
Real number (ℝ)
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.4362319 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 8 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.4440604 |
|---|---|
| Coefficient of variation (CV) | 0.71126179 |
| Kurtosis | 0.1844237 |
| Mean | 3.4362319 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.1012658 |
| Sum | 2371 |
| Variance | 5.9734314 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 165 | 23.9% |
| 3 | 160 | 23.2% |
| 1 | 151 | 21.9% |
| 7 | 71 | 10.3% |
| 4 | 40 | 5.8% |
| 5 | 34 | 4.9% |
| 8 | 28 | 4.1% |
| 10 | 20 | 2.9% |
| 9 | 11 | 1.6% |
| 6 | 10 | 1.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 151 | 21.9% |
| 2 | 165 | 23.9% |
| 3 | 160 | 23.2% |
| 4 | 40 | 5.8% |
| 5 | 34 | 4.9% |
| 6 | 10 | 1.4% |
| 7 | 71 | 10.3% |
| 8 | 28 | 4.1% |
| 9 | 11 | 1.6% |
| 10 | 20 | 2.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 20 | 2.9% |
| 9 | 11 | 1.6% |
| 8 | 28 | 4.1% |
| 7 | 71 | 10.3% |
| 6 | 10 | 1.4% |
| 5 | 34 | 4.9% |
| 4 | 40 | 5.8% |
| 3 | 160 | 23.2% |
| 2 | 165 | 23.9% |
| 1 | 151 | 21.9% |
Normal Nucleoli
Real number (ℝ)
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.8855072 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 4 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 3.0676823 |
|---|---|
| Coefficient of variation (CV) | 1.0631345 |
| Kurtosis | 0.419946 |
| Mean | 2.8855072 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.4052866 |
| Sum | 1991 |
| Variance | 9.410675 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 436 | 63.2% |
| 10 | 61 | 8.8% |
| 3 | 42 | 6.1% |
| 2 | 36 | 5.2% |
| 8 | 24 | 3.5% |
| 6 | 22 | 3.2% |
| 5 | 19 | 2.8% |
| 4 | 18 | 2.6% |
| 7 | 16 | 2.3% |
| 9 | 16 | 2.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 436 | 63.2% |
| 2 | 36 | 5.2% |
| 3 | 42 | 6.1% |
| 4 | 18 | 2.6% |
| 5 | 19 | 2.8% |
| 6 | 22 | 3.2% |
| 7 | 16 | 2.3% |
| 8 | 24 | 3.5% |
| 9 | 16 | 2.3% |
| 10 | 61 | 8.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 61 | 8.8% |
| 9 | 16 | 2.3% |
| 8 | 24 | 3.5% |
| 7 | 16 | 2.3% |
| 6 | 22 | 3.2% |
| 5 | 19 | 2.8% |
| 4 | 18 | 2.6% |
| 3 | 42 | 6.1% |
| 2 | 36 | 5.2% |
| 1 | 436 | 63.2% |
Mitoses
Real number (ℝ)
High correlation 
| Distinct | 9 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.5942029 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 5 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.7242305 |
|---|---|
| Coefficient of variation (CV) | 1.0815627 |
| Kurtosis | 12.489306 |
| Mean | 1.5942029 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.5414741 |
| Sum | 1100 |
| Variance | 2.9729707 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 571 | 82.8% |
| 2 | 35 | 5.1% |
| 3 | 32 | 4.6% |
| 10 | 14 | 2.0% |
| 4 | 12 | 1.7% |
| 7 | 9 | 1.3% |
| 8 | 8 | 1.2% |
| 5 | 6 | 0.9% |
| 6 | 3 | 0.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 571 | 82.8% |
| 2 | 35 | 5.1% |
| 3 | 32 | 4.6% |
| 4 | 12 | 1.7% |
| 5 | 6 | 0.9% |
| 6 | 3 | 0.4% |
| 7 | 9 | 1.3% |
| 8 | 8 | 1.2% |
| 10 | 14 | 2.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 14 | 2.0% |
| 8 | 8 | 1.2% |
| 7 | 9 | 1.3% |
| 6 | 3 | 0.4% |
| 5 | 6 | 0.9% |
| 4 | 12 | 1.7% |
| 3 | 32 | 4.6% |
| 2 | 35 | 5.1% |
| 1 | 571 | 82.8% |
Class
Categorical
High correlation 
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.5 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 6 |
| Mean length | 7.0347826 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | benign |
|---|---|
| 2nd row | benign |
| 3rd row | benign |
| 4th row | benign |
| 5th row | benign |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| benign | 452 | 65.5% |
| malignant | 238 | 34.5% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| n | 1380 | 28.4% |
| g | 690 | 14.2% |
| i | 690 | 14.2% |
| a | 476 | 9.8% |
| b | 452 | 9.3% |
| e | 452 | 9.3% |
| m | 238 | 4.9% |
| l | 238 | 4.9% |
| t | 238 | 4.9% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 4854 | 100.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| n | 1380 | 28.4% |
| g | 690 | 14.2% |
| i | 690 | 14.2% |
| a | 476 | 9.8% |
| b | 452 | 9.3% |
| e | 452 | 9.3% |
| m | 238 | 4.9% |
| l | 238 | 4.9% |
| t | 238 | 4.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 4854 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| n | 1380 | 28.4% |
| g | 690 | 14.2% |
| i | 690 | 14.2% |
| a | 476 | 9.8% |
| b | 452 | 9.3% |
| e | 452 | 9.3% |
| m | 238 | 4.9% |
| l | 238 | 4.9% |
| t | 238 | 4.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 4854 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| n | 1380 | 28.4% |
| g | 690 | 14.2% |
| i | 690 | 14.2% |
| a | 476 | 9.8% |
| b | 452 | 9.3% |
| e | 452 | 9.3% |
| m | 238 | 4.9% |
| l | 238 | 4.9% |
| t | 238 | 4.9% |
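The character counts and the mean label length for Class follow directly from the two label frequencies. A short sketch, using the benign and malignant counts reported for this variable (names are illustrative):

```python
from collections import Counter

# Label -> number of rows, copied from the Class "Common Values" table.
label_counts = {"benign": 452, "malignant": 238}

# Each character of a label occurs once per row carrying that label.
char_counts = Counter()
for label, count in label_counts.items():
    for ch in label:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(label_counts.values())

print(char_counts.most_common(3))
print(total_chars, round(mean_length, 7))
```

Both labels contain the letter n twice, which is why n appears 2 × 690 = 1380 times.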
Interactions
Correlations
| | Bare Nuclei | Bland Chromatin | Class | Clump Thickness | Marginal Adhesion | Mitoses | Normal Nucleoli | Single Epithelial Cell Size | Uniformity of Cell Shape | Uniformity of Cell Size |
|---|---|---|---|---|---|---|---|---|---|---|
| Bare Nuclei | 1.000 | 0.675 | 0.840 | 0.592 | 0.690 | 0.475 | 0.656 | 0.682 | 0.744 | 0.764 |
| Bland Chromatin | 0.675 | 1.000 | 0.804 | 0.539 | 0.627 | 0.386 | 0.666 | 0.642 | 0.694 | 0.721 |
| Class | 0.840 | 0.804 | 1.000 | 0.739 | 0.742 | 0.519 | 0.768 | 0.789 | 0.859 | 0.874 |
| Clump Thickness | 0.592 | 0.539 | 0.739 | 1.000 | 0.543 | 0.418 | 0.568 | 0.580 | 0.664 | 0.665 |
| Marginal Adhesion | 0.690 | 0.627 | 0.742 | 0.543 | 1.000 | 0.446 | 0.635 | 0.671 | 0.715 | 0.746 |
| Mitoses | 0.475 | 0.386 | 0.519 | 0.418 | 0.446 | 1.000 | 0.503 | 0.480 | 0.472 | 0.508 |
| Normal Nucleoli | 0.656 | 0.666 | 0.768 | 0.568 | 0.635 | 0.503 | 1.000 | 0.706 | 0.725 | 0.757 |
| Single Epithelial Cell Size | 0.682 | 0.642 | 0.789 | 0.580 | 0.671 | 0.480 | 0.706 | 1.000 | 0.757 | 0.785 |
| Uniformity of Cell Shape | 0.744 | 0.694 | 0.859 | 0.664 | 0.715 | 0.472 | 0.725 | 0.757 | 1.000 | 0.891 |
| Uniformity of Cell Size | 0.764 | 0.721 | 0.874 | 0.665 | 0.746 | 0.508 | 0.757 | 0.785 | 0.891 | 1.000 |
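The "High correlation" alerts can be read off this matrix. The sketch below assumes a cutoff of 0.5, which the report does not state but which is consistent with every alert count listed for this dataset. It uses the Mitoses row as an example; `THRESHOLD` and the other names are illustrative.

```python
THRESHOLD = 0.5  # assumed cutoff; not stated in the report

# Mitoses row of the correlation matrix (self-correlation omitted).
mitoses_row = {
    "Bare Nuclei": 0.475,
    "Bland Chromatin": 0.386,
    "Class": 0.519,
    "Clump Thickness": 0.418,
    "Marginal Adhesion": 0.446,
    "Normal Nucleoli": 0.503,
    "Single Epithelial Cell Size": 0.480,
    "Uniformity of Cell Shape": 0.472,
    "Uniformity of Cell Size": 0.508,
}

# Fields whose correlation with Mitoses reaches the assumed cutoff.
flagged = sorted(name for name, r in mitoses_row.items() if r >= THRESHOLD)

print(flagged)
```

Three fields pass the cutoff, matching the alert "Mitoses is highly overall correlated with Class and 2 other fields".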
Missing values
Sample
First rows
| | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bare Nuclei | Bland Chromatin | Normal Nucleoli | Mitoses | Class |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | benign |
| 1 | 5 | 4 | 4 | 5 | 7 | 10 | 3 | 2 | 1 | benign |
| 2 | 3 | 1 | 1 | 1 | 2 | 2 | 3 | 1 | 1 | benign |
| 3 | 6 | 8 | 8 | 1 | 3 | 4 | 3 | 7 | 1 | benign |
| 4 | 4 | 1 | 1 | 3 | 2 | 1 | 3 | 1 | 1 | benign |
| 5 | 8 | 10 | 10 | 8 | 7 | 10 | 9 | 7 | 1 | malignant |
| 6 | 1 | 1 | 1 | 1 | 2 | 10 | 3 | 1 | 1 | benign |
| 7 | 2 | 1 | 2 | 1 | 2 | 1 | 3 | 1 | 1 | benign |
| 8 | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 5 | benign |
| 9 | 4 | 2 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | benign |
Last rows
| | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bare Nuclei | Bland Chromatin | Normal Nucleoli | Mitoses | Class |
|---|---|---|---|---|---|---|---|---|---|---|
| 680 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 8 | benign |
| 681 | 1 | 1 | 1 | 3 | 2 | 1 | 1 | 1 | 1 | benign |
| 682 | 5 | 10 | 10 | 5 | 4 | 5 | 4 | 4 | 1 | malignant |
| 683 | 3 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | benign |
| 684 | 3 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 2 | benign |
| 685 | 3 | 1 | 1 | 1 | 3 | 2 | 1 | 1 | 1 | benign |
| 686 | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | benign |
| 687 | 5 | 10 | 10 | 3 | 7 | 3 | 8 | 10 | 2 | malignant |
| 688 | 4 | 8 | 6 | 4 | 3 | 4 | 10 | 6 | 1 | malignant |
| 689 | 4 | 8 | 8 | 5 | 4 | 5 | 10 | 4 | 1 | malignant |
Duplicate rows
Most frequently occurring
| | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bare Nuclei | Bland Chromatin | Normal Nucleoli | Mitoses | Class | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | benign | 26 |
| 5 | 1 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | benign | 23 |
| 4 | 1 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | benign | 22 |
| 19 | 3 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | benign | 20 |
| 18 | 3 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | benign | 12 |
| 12 | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | benign | 10 |
| 20 | 3 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | benign | 10 |
| 25 | 4 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | benign | 10 |
| 26 | 4 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | benign | 10 |
| 34 | 5 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | benign | 10 |